Data-driven Random Fourier Features using Stein Effect
Authors
Abstract
Large-scale kernel approximation is an important problem in machine learning research. Approaches using random Fourier features have become increasingly popular [Rahimi and Recht, 2007], where kernel approximation is treated as empirical mean estimation via Monte Carlo (MC) or Quasi-Monte Carlo (QMC) integration [Yang et al., 2014]. A limitation of current approaches is that every feature receives the same weight, with the weights summing to 1. In this paper, we propose a novel shrinkage estimator based on the "Stein effect", which provides a data-driven weighting strategy for random features and is theoretically justified in terms of lowering the empirical risk. We further present an efficient stochastic algorithm for large-scale applications of the proposed method. Our empirical results on six benchmark data sets demonstrate the advantageous performance of this approach over representative baselines in both kernel approximation and supervised learning tasks.
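To make the equal-weight limitation concrete, the following is a minimal sketch of the standard random Fourier feature estimator for a Gaussian kernel, written with the Python standard library only. It is not the shrinkage estimator proposed in the paper, which the abstract does not specify. The function names, the `sigma` bandwidth, and the optional `weights` argument are assumptions introduced for illustration; `weights` only marks the place where a data-driven weighting would replace the uniform 1/D.

```python
import math
import random


def gaussian_kernel(x, y, sigma=1.0):
    """Exact Gaussian (RBF) kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def sample_rff_params(dim, num_features, sigma=1.0, seed=0):
    """Draw frequencies w ~ N(0, sigma^-2 I) and phases b ~ U[0, 2*pi]."""
    rng = random.Random(seed)
    freqs = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
             for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    return freqs, phases


def rff_terms(x, y, freqs, phases):
    """Per-feature kernel estimates 2 * cos(w.x + b) * cos(w.y + b)."""
    terms = []
    for w, b in zip(freqs, phases):
        px = sum(wi * xi for wi, xi in zip(w, x)) + b
        py = sum(wi * yi for wi, yi in zip(w, y)) + b
        terms.append(2.0 * math.cos(px) * math.cos(py))
    return terms


def rff_kernel(x, y, freqs, phases, weights=None):
    """Weighted sum of per-feature estimates.

    With weights=None every feature gets weight 1/D (plain Monte Carlo mean).
    The `weights` hook is hypothetical: it only shows where non-uniform,
    data-driven weights could be plugged in.
    """
    terms = rff_terms(x, y, freqs, phases)
    if weights is None:
        weights = [1.0 / len(terms)] * len(terms)
    return sum(wt * t for wt, t in zip(weights, terms))


# Usage: compare the equal-weight approximation with the exact kernel value.
x = [0.3, -0.1, 0.5]
y = [0.1, 0.2, 0.4]
freqs, phases = sample_rff_params(dim=3, num_features=5000, seed=0)
approx = rff_kernel(x, y, freqs, phases)
exact = gaussian_kernel(x, y)
print(f"exact={exact:.4f}  approx={approx:.4f}")
```

Each per-feature term is an unbiased estimate of the kernel value, so the uniform average converges to it as the number of features grows. Unequal weights would trade some of that unbiasedness for a potentially lower overall error, which is the general idea behind shrinkage estimators.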
Similar resources
Data Dependent Kernel Approximation using Pseudo Random Fourier Features
Kernel methods are a powerful and flexible approach to solving many problems in machine learning. Because kernel methods require pairwise evaluations, the cost of kernel computation grows with the data size, which limits their applicability to large-scale datasets. Random Fourier Features (RFF) have been proposed to scale the kernel method for solving large scale data...
A Data-driven Method for Crowd Simulation using a Holonification Model
In this paper, we present a data-driven method for crowd simulation with a holonification model. With this extra module, simulation accuracy increases and the agents behave more realistically. First, we show how to use the concept of a holon in crowd simulation and how effective it is. To this end, we use simple rules for holonification. Using real-world data, we model the...
Detection of high impedance faults in distribution networks using Discrete Fourier Transform
In this paper, a new method for extracting dynamic properties for High Impedance Fault (HIF) detection using the discrete Fourier transform (DFT) is proposed. Unlike conventional methods that detect a high impedance fault using features extracted from data windows after the fault, in the proposed method, using the disturbance detection algorithm in the network, the normalized changes of the selected fea...
The Error Probability of Random Fourier Features is Dimensionality Independent
We show that the error probability of reconstructing kernel matrices from Random Fourier Features for any shift-invariant kernel function is at most O(exp(−D)), where D is the number of random features. We also provide a matching information-theoretic, method-independent lower bound of Ω(exp(−D)) for standard Gaussian distributions. Compared to prior work, we are the first to show that the error ...
Nyström Method vs Random Fourier Features: A Theoretical and Empirical Comparison
Both random Fourier features and the Nyström method have been successfully applied to efficient kernel learning. In this work, we investigate the fundamental difference between these two approaches, and how the difference could affect their generalization performances. Unlike approaches based on random Fourier features where the basis functions (i.e., cosine and sine functions) are sampled from...